Recently, toroviruses and coronaviruses have been found to be ancestrally related by divergence of their polymerase
and envelope proteins from common ancestors. In addition, their genome organization and expression strategy, which
involves the synthesis of a 3’-coterminal nested set of mRNAs, are comparable. Nucleotide sequence analysis of the
genome of the torovirus prototype, Berne virus (BEV), has now revealed the results of two independent nonhomologous
RNA recombinations during torovirus evolution. Berne virus open reading frame (ORF) 4 encodes a protein with
significant sequence similarity (30-35% identical residues) to a part of the hemagglutinin esterase proteins of
coronaviruses and influenza virus C. The sequence of the C-terminal part of the predicted BEV polymerase ORF 1a product
contains 31-36% identical amino acids when compared with the sequence of a nonstructural 30/32K coronavirus
protein. The cluster of coronaviruses which contains this nonstructural gene expresses it not as a part of their
polymerase, but by synthesizing an additional subgenomic mRNA. © 1991 Academic Press, Inc.


In 1982 and 1983 the characterization of two mor- 
phologically similar viruses in fecal material from cattle 
(Breda virus; BRV; (1)) and horse (Berne virus; BEV; (2))
was reported. BRV and BEV are antigenically related to 
each other but no cross-reactivity with antisera against 
other animal viruses could be detected (2). Although 
the peplomers on the envelope of the new viruses re- 
sembled those of coronaviruses, the unique nucleo- 
capsid morphology and morphogenesis of BRV (3) and 
BEV (4, 5) justified their classification as representa- 
tives of a new group of animal RNA viruses, the torovi- 
ruses (6, 7). 

During the past four years, we have studied the repli- 
cation strategy and genome organization of BEV, the 
prototype torovirus. The BEV genome consists of a sin- 
gle RNA molecule of positive polarity (8) with an esti- 
mated length of 25-30 kb (8, 9). In infected cells four 
3’-coterminal mRNAs are transcribed from the 3’ end of 
the BEV genome (8, 9). In vitro translation of subgenomic
BEV RNAs and nucleotide sequence analysis of
BEV cDNA have revealed that the subgenomic RNA
species are employed to express the structural genes
((10, 11); J. A. den Boon et al., submitted).
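
The idea of a 3’-coterminal nested set can be illustrated with a minimal sketch. It assumes that each subgenomic mRNA starts just upstream of one gene and runs to the common 3’ end, and it uses the gene order described in this paper for the 3’ part of the BEV genome (P, E, ORF 4, N). The function name, the labels, and the RNA numbering are illustrative assumptions; no sequence coordinates are implied.

```python
# Genes in the 3' part of the BEV genome, listed 5' to 3'.
# The order follows the text (P, E, ORF 4, N); the labels are illustrative.
GENES_5_TO_3 = ["P", "E", "ORF 4", "N"]


def nested_set(genes):
    """Return the gene content of each mRNA in a 3'-coterminal nested set.

    Each mRNA is assumed to start just upstream of one gene and to
    extend to the common 3' end, so it is a suffix of the gene list.
    """
    return [genes[start:] for start in range(len(genes))]


if __name__ == "__main__":
    for number, content in enumerate(nested_set(GENES_5_TO_3), start=2):
        # Numbering from 2 is an assumption (RNA 1 taken to be the genome).
        print(f"RNA {number}: {' - '.join(content)}")
```

In this simplified picture every mRNA ends with the N gene, and the mRNAs differ only in how far they extend toward the 5’ end.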

Coronaviruses also express their genetic information
from a 3’-coterminal nested set of mRNAs (reviewed
in (12)). In addition, the corona- and toroviral
genomes are of similar size (25-30 kb) and display the
same basic gene order: 5’-polymerase–spike protein–
membrane protein–nucleocapsid protein-3’ (Fig. 2A).
Nucleotide sequence analysis of (parts of) the polymerase
genes of the coronaviruses infectious bronchitis
virus (IBV; (13)) and mouse hepatitis virus (MHV; (14)),
and the torovirus BEV (15) has revealed that their
predicted polymerase proteins contain several homologous
domains (15). Furthermore, their polymerase
genes consist of two large open reading frames (ORFs)
of which the more downstream one (ORF 1b) is
expressed through ribosomal frameshifting (14-16).

Both similarities at the level of genome organization 
and the presence of homologous replicase protein se- 
quences are taken as indications for common ancestry 
(17, 18). Moreover, two of the three structural BEV pro- 
teins are thought to be related to coronaviral structural 
proteins. The BEV peplomer (P) and the coronaviral 
spike (S) protein are post-translationally cleaved, N-gly- 
cosylated proteins of similar size. Although no linear 
protein sequence similarity was detected, their dispo- 
sition in the viral membrane and their tertiary structure 
are predicted to be analogous; for both proteins dimerization,
probably leading to the formation of the distinctive
club-shaped spikes, has been demonstrated (11).
The structural characteristics of the BEV envelope (E) 
and coronaviral membrane (M) proteins are also strik- 
ingly similar: they are triple-spanning 25K-30K mem- 
brane proteins with comparable membrane topologies 
(J. A. den Boon et al., submitted). The small BEV nucleocapsid
(N) protein (18.3K; (10)), on the other hand,
seems to have little in common with its much larger
(45K-50K; (12)) coronaviral counterpart.

Fig. 1. cDNA sequence and translation of BEV ORF 4. The preparation, cloning, and sequence analysis of BEV cDNA were described
previously (9). The termination codon () of the upstream E protein gene (ORF 3) and the initiation codon (>) of the downstream N protein gene
(ORF 5) are also included in the figure. The conserved putative “core promoter” sequences for RNA 4 and RNA 5 transcription are indicated. The
translation of the region upstream of the ORF 4 initiation codon (used in Fig. 2B) is shown in lowercase letters. The nucleotide sequence data in
this figure have been submitted to the EMBL nucleotide sequence database and have been assigned the Accession Number X52375.


In addition to divergence from a common ancestor,
RNA recombination is considered an important factor
in RNA virus evolution (17, 18). Homologous recombination
between highly similar RNA sequences has
been found to occur during the multiplication of a number
of plant and animal RNA viruses (19-26). Nonhomologous
RNA recombination events (i.e., the incorporation
of heterologous RNA sequences) have been proposed,
e.g., to explain the presence of tRNA
sequences in alphaviral defective interfering RNAs
(27). Undisputed examples of nonhomologous recombination
in infectious (nonretroviral) RNA virus genomes
have been described only recently (28-30).
One of these recombinations (28) involves the gene
which encodes the hemagglutinin esterase (HE) protein
of influenza virus C (IVC). Proteins with remarkable
sequence similarity to the IVC hemagglutinin HE1 subunit
are encoded by genes of murine (MHV) and bovine
(BCV) coronaviruses (28, 31, 32). Because such a gene
is lacking in the genomes of coronaviruses from other
antigenic clusters (e.g., IBV; Fig. 2A), a heterologous
recombination event involving an IVC-like virus and an
ancestral coronavirus was postulated to explain the
presence of an HE gene in MHV and BCV (28).

In this report we present evidence for two independent
nonhomologous RNA recombination events during
BEV evolution. It is remarkable that, in addition to
the evidence for common ancestry presented above,
these recombinations also associate toroviruses with
coronaviruses.

Figure 1 shows the previously unreported nucleotide
sequence of BEV ORF 4, which is located between the
E and N protein genes ((9); see also Fig. 2A). The protein
encoded by this ORF (Fig. 1) shows sequence similarity
to the C-terminal parts of the coronaviral HE protein
and the IVC HE1 subunit (Fig. 2B). However, the
ORF 4 product consists of only 142 amino acids (aa),
whereas both the coronaviral HE protein and the IVC
HE1 subunit are more than 400 aa in length. The sequence
of the ORF 4 product shares 30-35% identical
amino acid residues with both the IVC and the MHV/BCV
HE sequences. The predicted BEV product contains
a hydrophobic C-terminus, but lacks the catalytic
center of the acetylesterase which is located in the
N-terminal part of the protein (33). Five cysteine residues
in the C-terminus of the HE protein which are
conserved between IVC and coronaviruses (34) are
also found in the BEV sequence (Fig. 2B). Possibly, the
5’ part of BEV ORF 4 has been removed by a recent
deletion event which did not inactivate the RNA 4
transcription initiation site (9). The first ORF 4 AUG codon
would in this case not be the ‘original’ translation
initiation codon. This hypothesis is supported by the fact
that the similarity with the IVC sequence and, to a
lesser extent, the coronaviral sequence continues upstream
of the present ORF 4 starting methionine residue
(Fig. 2B).
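
A percentage of identical residues such as the one quoted above can be computed from two rows of an alignment. The following is a minimal sketch only: the function name, the gap characters, and the choice of denominator (aligned positions without a gap in either row) are assumptions, since the text does not state which convention was used, and the two short rows are made-up toy strings, not the sequences of Fig. 2B.

```python
def percent_identity(row_a: str, row_b: str, gap_chars: str = ".-") -> float:
    """Percentage of identical residues between two aligned rows.

    Positions with a gap in either row are skipped (an assumed
    convention). Comparison is case-insensitive, because alignment
    figures often use letter case only as a display device.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(row_a, row_b):
        if a in gap_chars or b in gap_chars:
            continue  # gapped column: not counted
        compared += 1
        if a.upper() == b.upper():
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared


# Toy rows (hypothetical): 8 ungapped columns, 7 of them identical.
toy_a = "PVQA.QSIW"
toy_b = "PVTAVQSIW"
print(percent_identity(toy_a, toy_b))  # 87.5
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which would give somewhat different percentages for the same alignment.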

The ORF 4 sequence similarities do not indicate a 
closer relationship to the homologous gene of either 
coronaviruses or IVC. The IVC HE1 subunit derives 
from cleavage of an HE1-HE2 precursor at an internal
stretch of hydrophobic amino acid residues (35). Nei- 
ther the BEV ORF 4 product nor the coronaviral HE 
protein contains sequences which are homologous to 
the IVC HE2 subunit. Instead they possess a very hy- 
drophobic C-terminus which may represent the result 
of an adaptation of the hydrophobic HE2 N-terminus to 
become a membrane anchor. Though independent re- 
combination events cannot be excluded, the presence 
of the same C-terminal adaptation in the proteins of 
both MHV/BCV and BEV lends some credibility to a 
recombination involving the ancestors of these vi- 
ruses. Either a coronavirus or a torovirus may have 
been involved in the initial recombination with IVC. 

The second nonhomologous recombination event in 
the BEV genome is quite similar to the case of ORF 4. 

[Fig. 2, panels B-D: amino acid sequence alignments. B, MHV HE, BCV HE, BEV ORF 4, and IVC HE. C, BEV POL1a, MHV ns30K, and BCV ns32K. D, BEV POL1a, MHV POL1a, and IBV POL1a.]

Fig. 2. Comparison of the genome organizations of the torovirus BEV and the coronaviruses MHV and IBV. A, Schematic representation of the
open reading frames in the 3’ half of the genomes of BEV, MHV, and IBV. The three basic structural genes P, E, and N (BEV) and S, M, and N
(MHV and IBV) are represented by dotted boxes. Filled boxes indicate homologous domains in the polymerase proteins of toro- and coronaviruses.
The hatched (ns) and cross-hatched (ORF 4 / HE) areas indicate the position of “recombinant” genes in the genomes of BEV and MHV. B,
Alignment of the deduced amino acid sequence of the BEV ORF 4 product with the C-terminus of the coronaviral HE protein (upper three rows) 
and the IVC HE1 protein (lower two rows). Identical amino acid residues are shown in capitals; —, amino acid identity or conservative substitution 
between BEV and MHV/BCV or BEV and IVC; + , conserved cysteine residue; * , termination codon for translation; the arrow points towards the 
(present) starting methionine residue of the BEV ORF 4 product. C, Alignment of the C-terminal part of the amino acid sequence of the BEV 
ORF 1a product with the N-terminal parts of the MHV ns30K and BCV ns32K sequences. Legend as for B. D, Alignment of a possible conserved 
amino acid sequence motif from the C-terminus of the BEV, MHV, and IBV ORF 1a products (see Fig. 2A). Legend as for B. The distances (in 
amino acids) to the ribosomal frameshifting site (RFS) and the ORF 1a termination codon are indicated. 

Amino acid sequence comparison revealed similarity
between a previously reported part of the BEV polymerase
(15) and a coronaviral nonstructural (ns) protein:
the C-terminus of the predicted BEV ORF 1a product
contains 31-36% identical amino acid residues when
compared to the N-terminal 190 aa of the MHV ns30K
(28, 31) and the BCV ns32K (36) protein (Fig. 2C). Like
the HE gene, the ns30/32K gene, which is located between
the polymerase and HE genes in MHV and BCV
(Fig. 2A), is absent in coronaviruses from other antigenic
clusters (e.g., IBV; Fig. 2A). Apparently, a sequence
related to the 5’ two-thirds of this coronaviral
ns gene, which is expressed from a separate subgenomic
mRNA in MHV- and BCV-infected cells, has been
integrated into BEV ORF 1a and is now expressed as a
part of the BEV polymerase. The expression of the
MHV ns30K protein in infected cells has recently been
studied (37), but no information about its role in viral
replication has been obtained. The suggestion that the
ns30K protein contains a nucleotide binding motif (28)
is at odds with the lack of conservation of this postulated
MHV domain in BCV and BEV (Fig. 2C).

The BEV sequence which is homologous to the coronaviral
ns protein gene is located just upstream of
the ribosomal frameshifting site (15). In the coronaviruses
IBV and MHV this frameshift region (at the nucleotide
level) and the downstream ORF 1b (at the
amino acid level) are highly conserved (14). The ORF 1a
sequence of IBV has been determined completely (13),
but from the C-terminal region of the MHV ORF 1a product
only about 100 aa are known (14). These C-terminal
ORF 1a polymerase sequences of IBV and MHV
are also highly similar (Fig. 2D). In addition, a small domain
of sequence similarity with the C-terminal part of the
BEV ORF 1a product was identified (Figs. 2A and 2D).
This similarity is reminiscent of the homologous polymerase
domains which were identified in the ORF 1b
products of toro- and coronaviruses (15). Although the
motif is very short, its position, immediately upstream
of the presumed recombination site, indicates that a
recombination-insertion event between this region
and the frameshift area in the BEV genome may have
taken place.

Information on the genome structure of corona- and 
especially toroviruses is still quite fragmentary. Al- 
though it is difficult to reconstruct the sequence of 
events which resulted in the present genome organiza- 
tion of viruses like BEV, IBV, and MHV (Fig. 2A), it is 
clear that nonhomologous RNA recombination has 
played an important role in their evolution. Apparently, 
both an ancestor of MHV/BCV and an ancestral toro- 
virus have acquired homologous protein sequences as 
the result of independent recombination events; the 
HE and ns30/32K genes are lacking in other coronaviruses
(which excludes divergent evolution) and the
corresponding BEV sequences are located at different
positions in the genome (Fig. 2A). Considering the fact
that several present-day representatives of both virus
groups cause enteric infections, direct recombination
between toro- and coronaviruses during coinfection of
the same cell seems feasible. However, the involvement
of “a third party” of viral or cellular origin cannot
be excluded.